Test case prioritization approach based on historical data and multi-objective optimization
LI Xingjia, YANG Qiuhui, HONG Mei, PAN Chunxia, LIU Ruihang
Journal of Computer Applications, 2023, 43(1): 221-226. DOI: 10.11772/j.issn.1001-9081.2021112015
To improve the fault detection efficiency and the regression testing benefit of test case sequences, a test case prioritization approach based on historical data and multi-objective optimization was proposed. Firstly, the test case set was clustered according to the text topic similarity and code coverage similarity of the test cases, and association rules describing execution-failure relationships between test cases were mined from historical execution information, laying the groundwork for the subsequent steps. Then, a multi-objective optimization algorithm was used to sort the test cases within each cluster, and the per-cluster orderings were merged into the final sequence so that similar test cases were kept apart. Finally, the mined association rules were used to dynamically adjust the execution order during testing, so that test cases likely to fail were executed first, further improving defect detection efficiency. Compared with random search, a clustering-based approach, a topic-model-based approach, and an approach based on association rules and multi-objective optimization, the proposed approach increased the average value of Average Percentage of Faults Detected (APFD) by 12.59%, 5.98%, 3.01% and 2.95%, respectively, and the average value of cost-cognizant APFD (APFDc) by 17.17%, 5.04%, 5.08% and 8.21%, respectively. Experimental results show that the proposed approach can effectively improve the benefit of regression testing.
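APFD, the headline metric above, has a standard closed form: for an ordering of n test cases that together reveal m faults, APFD = 1 - (TF_1 + ... + TF_m)/(n·m) + 1/(2n), where TF_i is the 1-based position of the first test case revealing fault i. The minimal sketch below computes it under the usual assumption that every fault is revealed by at least one test case; the `fault_matrix` data layout is illustrative and not taken from the paper.

```python
def apfd(ordering, fault_matrix):
    """Average Percentage of Faults Detected for a given test ordering.

    ordering     : list of test-case ids in execution order
    fault_matrix : dict mapping test-case id -> set of fault ids it reveals
                   (illustrative data layout, not from the paper)
    """
    n = len(ordering)
    faults = set().union(*fault_matrix.values())
    m = len(faults)
    first_reveal = {}  # TF_i: 1-based position of the first test revealing fault i
    for pos, tc in enumerate(ordering, start=1):
        for f in fault_matrix.get(tc, ()):
            first_reveal.setdefault(f, pos)
    return 1.0 - sum(first_reveal[f] for f in faults) / (n * m) + 1.0 / (2 * n)

# Example: running t2 first reveals both faults at position 1, giving APFD ≈ 0.83.
print(apfd(["t2", "t1", "t3"], {"t1": {1}, "t2": {1, 2}, "t3": set()}))
```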
Data preprocessing method in software defect prediction
PAN Chunxia, YANG Qiuhui, TAN Wukun, DENG Huixin, WU Jia
Journal of Computer Applications, 2020, 40(11): 3273-3279. DOI: 10.11772/j.issn.1001-9081.2020040464
Software defect prediction is a hot research topic in the field of software quality assurance, and the quality of a defect prediction model is closely tied to its training data. The datasets used for defect prediction mainly suffer from two problems: data feature selection and class imbalance. For feature selection, common software development process features together with newly proposed extended process features were used, and a feature selection algorithm based on clustering analysis was then applied. For class imbalance, an improved Borderline-SMOTE (Borderline Synthetic Minority Oversampling Technique) method was proposed to make the numbers of positive and negative samples in the training set relatively balanced and to make the synthesized samples more consistent with real sample characteristics. Experiments were performed on the open source datasets of projects such as bugzilla and jUnit. The results show that the feature selection algorithm reduces model training time by 57.94% while keeping a high F-measure; compared with the model trained on samples processed by the original method, the model obtained with the improved Borderline-SMOTE method increases Precision, Recall, F-measure and AUC (Area Under the Curve) by 2.36, 1.8, 2.13 and 2.36 percentage points on average, respectively; the model built with the extended process features improves F-measure by 3.79% on average over the model without them; and compared with models obtained by methods in the literature, the model obtained by the proposed method improves F-measure by 15.79% on average. The experimental results prove that the proposed method can effectively improve the quality of defect prediction models.
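The two preprocessing steps described above can be sketched roughly as follows. The paper's improved Borderline-SMOTE variant and its exact clustering-based feature selector are not publicly specified, so this sketch substitutes the standard `BorderlineSMOTE` from the imbalanced-learn package and a simple correlation-clustering selector; treat both as illustrative assumptions rather than the authors' method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from imblearn.over_sampling import BorderlineSMOTE

def select_features_by_clustering(X, n_clusters):
    """Group correlated features and keep one representative per cluster."""
    corr = np.nan_to_num(np.corrcoef(X, rowvar=False))  # constant features give NaN; treat as uncorrelated
    dist = 1.0 - np.abs(corr)                            # turn correlation into a distance
    condensed = dist[np.triu_indices_from(dist, k=1)]    # condensed form expected by linkage()
    labels = fcluster(linkage(condensed, method="average"), t=n_clusters, criterion="maxclust")
    return sorted(int(np.where(labels == c)[0][0]) for c in np.unique(labels))

def preprocess(X, y, n_clusters=10):
    """Feature selection followed by Borderline-SMOTE oversampling of the minority class."""
    cols = select_features_by_clustering(X, n_clusters)
    X_balanced, y_balanced = BorderlineSMOTE(random_state=0).fit_resample(X[:, cols], y)
    return X_balanced, y_balanced, cols
```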
Novel K-medoids clustering algorithm based on breadth-first search
YAN Hongwen, ZHOU Yamei, PAN Chu
Journal of Computer Applications, 2015, 35(5): 1302-1305. DOI: 10.11772/j.issn.1001-9081.2015.05.1302

To address the disadvantages of the traditional K-medoids clustering algorithm, such as its sensitivity to the random initial selection of centers and its poor accuracy, a breadth-first search strategy for cluster centers, built on an effective granular-computing-based initialization, was proposed. The new algorithm first selected K granules using granular computing and took their corresponding centers as the K initial centers. Secondly, according to the similarity between objects, a binary tree of similar objects was built for each initial center, with that center as the root node, and breadth-first search was used to traverse the binary trees to find the K optimal centers. In addition, the fitness function was optimized using within-cluster distance and between-cluster distance. The experimental results on the standard UCI datasets Iris and Wine show that the proposed algorithm effectively reduces the number of iterations while guaranteeing clustering accuracy.
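For orientation, a bare-bones K-medoids loop with a fitness that combines within-cluster and between-cluster distance is sketched below. The granular-computing initialization and the breadth-first traversal of similarity binary trees are specific to the paper and are not reproduced; the sketch refines the given initial medoids with a plain swap search instead, and the within/between ratio is an assumed form of the fitness, not the paper's exact function.

```python
import numpy as np

def fitness(X, medoids, labels):
    """Within-cluster distance divided by between-cluster distance (smaller is better)."""
    within = sum(np.linalg.norm(X[labels == i] - X[m], axis=1).sum()
                 for i, m in enumerate(medoids))
    centers = X[medoids]
    between = sum(np.linalg.norm(centers[i] - centers[j])
                  for i in range(len(medoids)) for j in range(i + 1, len(medoids)))
    return within / (between + 1e-12)

def k_medoids(X, init_medoids, max_iter=100):
    """Refine initial medoids by greedy swaps that improve the fitness."""
    medoids = list(init_medoids)
    labels = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2).argmin(axis=1)
    best = fitness(X, medoids, labels)
    for _ in range(max_iter):
        improved = False
        for i in range(len(medoids)):
            for cand in np.where(labels == i)[0]:
                trial = medoids.copy()
                trial[i] = int(cand)
                trial_labels = np.linalg.norm(
                    X[:, None, :] - X[trial][None, :, :], axis=2).argmin(axis=1)
                f = fitness(X, trial, trial_labels)
                if f < best:
                    best, medoids, labels, improved = f, trial, trial_labels, True
        if not improved:
            break
    return medoids, labels
```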

Improved K-medoids clustering algorithm based on improved granular computing
PAN Chu, LUO Ke
Journal of Computer Applications, 2014, 34(7): 1997-2000. DOI: 10.11772/j.issn.1001-9081.2014.07.1997

To address the disadvantages of the traditional K-medoids clustering algorithm, such as its sensitivity to the initial selection of centers, slow convergence and poor accuracy, a novel K-medoids algorithm based on improved Granular Computing (GrC), a granule iterative search strategy and a new fitness function was proposed. The algorithm used the idea of granular computing to select K granules located in high-density areas and far apart from each other, took their center points as the K initial cluster centers, and updated the K center points within candidate granules to reduce the number of iterations. Furthermore, a new fitness function based on between-cluster distance and within-cluster distance was presented to improve clustering accuracy. Tests on a number of standard UCI datasets show that the new algorithm effectively reduces the number of iterations and improves clustering accuracy.
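The initialization idea above, picking K granules that lie in dense regions and are mutually far apart, can be sketched roughly as follows. The paper's actual granule construction is not reproduced; the neighbour-count density estimate, the median-distance radius, and the greedy farthest-point selection are all illustrative assumptions.

```python
import numpy as np

def granular_init(X, k, radius=None):
    """Pick k initial medoids that are both dense and mutually far apart."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    if radius is None:
        radius = np.median(d)                   # crude default neighbourhood scale (assumption)
    density = (d < radius).sum(axis=1)          # number of neighbours within the radius
    chosen = [int(density.argmax())]            # start from the densest point
    candidates = np.argsort(density)[len(X) // 2:]    # restrict later picks to the denser half
    while len(chosen) < k:
        score = d[candidates][:, chosen].min(axis=1)  # distance to the nearest already-chosen point
        chosen.append(int(candidates[score.argmax()]))
    return chosen
```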
